Fast Matching of Twig Patterns

Authors

  • Jiang Li
  • Junhu Wang
Abstract

Twig pattern matching plays a crucial role in XML data processing. Existing twig pattern matching algorithms can be classified into two-phase algorithms and one-phase algorithms. While the two-phase algorithms (e.g., TwigStack) suffer from an expensive merging cost, the one-phase algorithms (e.g., TwigList, Twig2Stack, HolisticTwigStack) either lack efficient filtering of useless elements or use over-complicated data structures. In this paper, we present two novel one-phase holistic twig matching algorithms, TwigMix and TwigFast, which combine the efficient selection of useful elements (introduced in TwigStack) with the simple lists for storing final solutions (introduced in TwigList). TwigMix introduces the element selection function of TwigStack into TwigList to avoid manipulating useless elements in the stack and lists. TwigFast improves on this by adding pointers to the lists, which removes the need for stacks altogether. Our experiments show that TwigMix significantly and consistently outperforms TwigList and HolisticTwigStack (up to several times faster), and that TwigFast is up to two times faster than TwigMix.
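
To make the problem concrete, the following is a minimal sketch of what a twig pattern match is. It is a naive baseline, not TwigStack, TwigList, TwigMix, or TwigFast, and it performs none of the element filtering the abstract describes. It assumes the (start, end, level) region encoding commonly used in this line of work; the names `Region`, `QNode`, `encode`, `related`, `matches`, and `twig_match` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class Region:
    """An element labeled with its region code (assumed encoding)."""
    tag: str
    start: int
    end: int
    level: int


@dataclass
class QNode:
    """A query node; each child is (axis, QNode) with axis '/' or '//'."""
    tag: str
    children: list = field(default_factory=list)


def encode(root):
    """Assign (start, end, level) to every element by a depth-first walk."""
    regions = []
    counter = 0

    def walk(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        slot = len(regions)
        regions.append(None)  # reserve the preorder position
        for child in elem:
            walk(child, level + 1)
        counter += 1
        regions[slot] = Region(elem.tag, start, counter, level)

    walk(root, 0)
    return regions


def related(a, d, axis):
    """True if d is a descendant ('//') or a child ('/') of a."""
    inside = a.start < d.start and d.end < a.end
    return inside and (axis == "//" or d.level == a.level + 1)


def matches(q, e, regions):
    """All matches of the query subtree q rooted at element e."""
    if q.tag != e.tag:
        return []
    per_child = []
    for axis, child_q in q.children:
        sub = [m for d in regions if related(e, d, axis)
               for m in matches(child_q, d, regions)]
        if not sub:
            return []  # one unsatisfied branch rules out e
        per_child.append(sub)
    # Combine one match per branch; tuples follow query preorder.
    return [(e,) + tuple(x for part in combo for x in part)
            for combo in product(*per_child)]


def twig_match(q, regions):
    """Try every element as a candidate for the query root."""
    return [m for e in regions for m in matches(q, e, regions)]


# Query a[.//c][d] over a small document.
doc = ET.fromstring("<a><b><c/></b><d/><a><c/></a></a>")
regions = encode(doc)
query = QNode("a", [("//", QNode("c")), ("/", QNode("d"))])
result = twig_match(query, regions)
for m in result:
    print([(r.tag, r.start) for r in m])
```

On this document the outer `a` matches twice (once for each `c` descendant, both paired with its `d` child), while the inner `a` is rejected because it has no `d` child. The sketch rescans every element for every query branch, which is the kind of work on useless elements that the algorithms named in the abstract are designed to avoid.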

Similar resources

TwigList: Make Twig Pattern Matching Fast

The twig pattern matching problem has been widely studied in recent years. Given an XML tree T, a twig-pattern matching query Q, represented as a query tree, is to find all the occurrences of such a twig pattern in T. Previous works like HolisticTwig and TJFast decomposed the twig pattern into single paths from root to leaves, and merged all the occurrences of such path-patterns to find the occurre...

A Hybrid Approach for General XML Query Processing

The state-of-the-art XML twig pattern query processing algorithms focus on matching a single twig pattern to a document. However, many practical queries are modeled by multiple twig patterns with joins to link them. The output of twig pattern matching is tuples of labels, while the joins between twig patterns are based on values. The inefficiency of integrating label-based structural joins in t...

MARS: A Matching and Ranking System for XML Content and Structure Retrieval

Structural queries specify complex predicates on the content and the structure of the elements of tree-structured XML documents. Recent works have typically applied top-down decomposition of the twig patterns into (i) parent-child or ancestor-descendant relationships, or (ii) path expression queries, followed by a join operation to reconstruct matched twig patterns. This demonstration s...

QuickStack: A Fast Algorithm for XML Query Matching

With the increasing popularity of XML for data representation and exchange, much research has been done on providing an efficient way to evaluate twig patterns in an XML database. As a result, many holistic join algorithms have been developed, most of which are derivatives of the well-known TwigStack algorithm. However, these algorithms still apply a two-phase processing scheme: first identify...

Prefix Path Streaming: a New Clustering Method for XML Twig Pattern Matching

Searching for all occurrences of a twig pattern in an XML document is an important operation in XML query processing. Recently a class of holistic twig pattern matching algorithms has been proposed. Compared with the prior approaches, the holistic method avoids generating large intermediate results which do not contribute to the final answer. The method is CPU and I/O optimal when twig patterns ...

Publication date: 2008